ABSTRACT

The purpose of this study was to determine the
proportion of words that the Dolch list and the word list for the
1970's accounted for in written materials. The two word lists were
compared by token comparison, a method by which words are weighted on
the basis of frequency of occurrence within a given set of materials.
The written materials used for the comparisons were two large word
count studies published within the past decade. In determining the
cumulative frequency of words on each list, regularly inflected forms
were combined. Calculating the cumulative frequencies of both the
word list for the 1970's and the Dolch list in each of the two large
word count studies made it possible to determine whether any
differences existed and the statistical significance of the
differences. The results indicated that the word list for the 1970's
accounted for a significantly greater proportion of words than the
Dolch list in written materials encountered by both children and
adults. However, both lists accounted for over 50 percent of the
words used in materials for children and adults. (WB)

There seems to be general agreement among reading authorities that the
Dolch (1936) basic sight vocabulary of 220 words has received widespread use
among teachers. Recently, however, criticisms (Johnson, 1971b; Otto & Chester,
1972) have characterized the Dolch list with labels such as "passe" and
"pseudo-empirical". These criticisms may, at least in part, be justified, and
there is little wonder that a number of researchers (Harris & Jacobson, 1972;
Hillerich, 1974; Johns, 1974; Johnson, 1971a; Otto & Chester, 1972) have offered
word lists to replace or revise Dolch's original sight vocabulary.

Background of Study 

Recently, Johns (1974) developed a basic word list for the 1970's which met
several criteria. Specifically, the list contained words that occurred frequently
in:

1. materials read by children in grades three through nine,

2. materials read by adults,

3. library books read by primary grade children,

4. the spontaneous speaking vocabulary of children in kindergarten and
first grade.

In addition, the list contained no nouns and combined regularly inflected forms
of a given root word.



After the word list for the 1970's was compiled, it was assumed that the
list would have high utility at all levels of reading development. Not stated,
but certainly implied, was the assertion that the word list for the 1970's would
be more useful than the Dolch list. One potential measure of "usefulness" is
the number and per cent of words in textual materials that can be accounted for
by a particular word list. Previous research (Dolch, 1948; Guszak, 1972; Johns,
1971; Zintz, 1972) has shown that the Dolch list is useful in that it accounts
for over 50 per cent of the words in basal readers and other materials. If the
word list for the 1970's could account for a significantly greater proportion
of words than the Dolch list, there would be an empirical base for claiming
that this recently compiled word list is more useful than the Dolch list.

Purpose of Study 

The purpose of the present study was to determine the proportion of words
that the Dolch list and the word list for the 1970's accounted for in written
materials. Specifically, answers to the following two questions were sought:

1. Does the word list for the 1970's account for a greater proportion of
words in materials commonly used by children than the Dolch list? Is
the difference statistically significant?

2. Does the word list for the 1970's account for a greater proportion of
words in materials commonly read by adults than the Dolch list? Is
the difference statistically significant?

Procedure for the Comparisons 
It was decided that the two word lists would be compared by what is commonly
referred to as a token comparison. In a token comparison, words are weighted on
the basis of their frequency of occurrence within a given set of materials. It
was assumed that the written materials used in the study were representative of
the words typically encountered by children and adults. The written materials
used for the comparisons were two large word count studies published within
the past decade.
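
The idea of a token comparison can be illustrated with a minimal sketch. It is not the procedure actually used in the study: the function name `token_coverage`, the variable `corpus_counts`, and the ten-word toy corpus are all invented for illustration. The sketch simply sums the corpus frequencies of the words on a list and divides by the total number of running words.

```python
from collections import Counter

def token_coverage(word_list, corpus_counts):
    """Return (running words covered, share of all running words covered).

    Each listed word is weighted by its frequency in the corpus, so a
    very frequent word contributes more than a rare one.
    """
    total = sum(corpus_counts.values())
    # Counter returns 0 for words that never occur in the corpus
    covered = sum(corpus_counts[word] for word in set(word_list))
    return covered, covered / total

# Toy corpus of ten running words, invented for illustration only
corpus_counts = Counter("the cat and the dog saw the bird and ran".split())

covered, share = token_coverage(["the", "and", "a"], corpus_counts)
print(covered, share)
```

In this toy case a three-word list covers half of the running words, even though only two of its words occur in the corpus at all. A comparison based on types (distinct words) would weight every listed word equally instead.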

The first source utilized was the American Heritage Intermediate (AHI)
Corpus published in the Word Frequency Book (Carroll, Davies, & Richman, 1971).
The AHI Corpus was compiled from samples of published materials to which U.S.
students are exposed in grades three through nine. The materials included
textbooks, workbooks, kits, novels, poetry, general nonfiction, encyclopedias,
and magazines. The AHI Corpus contains 5,088,721 words drawn in 500-word
samples from 1,045 texts. There are 86,741 different words in the Corpus.

The second source was the Kucera-Francis (1967) Corpus. The Corpus
was compiled from a wide body of "natural-language" adult published materials
ranging from all kinds of newspaper writing to learned journal articles. The
Kucera-Francis Corpus contains 1,014,232 words drawn in 500 samples of
approximately 2,000 words each. There are 50,406 different words in the Corpus.

In determining the cumulative frequency of words on each list, the
investigator combined regularly inflected forms. For this study the term
"regularly inflected" included these endings: s, es, ed, er (as comparative,
not agent), est, ing, 's (indicating possession or plurality, not contraction),
s', and the dialectal in'. In general, if the form of the root word was kept
intact, the inflected form was included. Examples here would be her-hers,
it-its, little-littlest; dialect forms such as know-knowed; and misuses such as
best-best's and i for I. Changes in meaning were not included, such as
short-shorts, max-MaQT, or new-news. Also, two inflected endings
(wash-washings) were omitted, as well as spelling changes which obliterated
the root word (funny-funnier, ride-riding, sit-sitting). And, finally, archaic
forms (the verb endings est, eth), alternate spellings (bye for by), and
misspellings were ruled out.
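
The central rule above, that an inflected form is counted with its root only when the root is kept intact, can be sketched as follows. This is an illustrative simplification and not the investigator's actual procedure: the function `combined_frequency`, the `SUFFIXES` tuple (an assumed subset of the endings named above), and the toy counts are hypothetical. The sketch does not attempt the judgment-based exclusions (changes in meaning, agentive er, double endings), which would require inspecting each word.

```python
from collections import Counter

# Assumed subset of the regular endings, for illustration only
SUFFIXES = ("s", "es", "ed", "er", "est", "ing")

def combined_frequency(root, counts, suffixes=SUFFIXES):
    """Sum the frequency of a root and of every form made by appending a
    suffix to the unchanged root. Forms whose spelling alters the root
    (ride -> riding, sit -> sitting) are never generated, so they are
    left out, as in the rule described in the text."""
    forms = {root} | {root + suffix for suffix in suffixes}
    return sum(counts[form] for form in forms)

# Toy frequency counts, invented for illustration only
counts = Counter({"look": 5, "looks": 2, "looked": 3, "looking": 4,
                  "ride": 2, "riding": 1})

look_total = combined_frequency("look", counts)
ride_total = combined_frequency("ride", counts)
print(look_total, ride_total)
```

Here the four forms of "look" are pooled into one frequency, while "riding" is not added to "ride" because the root spelling is not preserved.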

Calculating the cumulative frequencies of the word list for the 1970's and
the Dolch list in each of the two large word count studies made it possible to
determine whether any differences existed and the statistical significance of
the differences.

Results 

Since the word list for the 1970's and the Dolch list had 189 words in
common, the cumulative frequencies of these words were determined in both the
AHI Corpus and the Kucera-Francis Corpus. As indicated in Table 1, approximately
2,763,051 of the 5,088,721 words in the AHI Corpus were accounted for by the
189 words common to both lists. The 31 unique words in the Dolch list resulted
in another 40,469 words for a cumulative frequency of 2,803,520 words. The
word list for the 1970's contained 37 unique words which accounted for an
additional 130,113 words. Adding this figure to the 2,763,051 resulted in a
cumulative frequency of 2,893,164 words.
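
The arithmetic in the preceding paragraph can be checked with a short worked example. All figures are taken from the text; the variable names are introduced here only for illustration.

```python
# Number of words (types) on each list, as stated in the text
COMMON_TYPES, DOLCH_ONLY_TYPES, NEW_ONLY_TYPES = 189, 31, 37

# Running words (tokens) in the AHI Corpus accounted for by each group
COMMON_TOKENS = 2_763_051
DOLCH_ONLY_TOKENS = 40_469
NEW_ONLY_TOKENS = 130_113

dolch_size = COMMON_TYPES + DOLCH_ONLY_TYPES          # size of the Dolch list
new_size = COMMON_TYPES + NEW_ONLY_TYPES              # size of the 1970's list

dolch_cumulative = COMMON_TOKENS + DOLCH_ONLY_TOKENS  # Dolch cumulative frequency
new_cumulative = COMMON_TOKENS + NEW_ONLY_TOKENS      # 1970's list cumulative frequency

print(dolch_size, new_size)
print(dolch_cumulative, new_cumulative, new_cumulative - dolch_cumulative)
```

Because the 189 shared words contribute equally to both totals, the whole difference between the lists (89,644 running words) comes from the words unique to each list.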

It was evident that the word list for the 1970's accounted for a greater
proportion of words in the AHI Corpus than the Dolch list. To test whether or
not this difference was statistically significant, a one-tailed test for
proportions was conducted. As indicated in Table 1, differences significant beyond
the .01 level existed for both unique words and the cumulative frequency of words.
On the basis of frequency, the word list for the 1970's accounted for significantly
more words in materials commonly used by children than the Dolch list.

A similar procedure was used to compare the Dolch list and the word list for
the 1970's to the Kucera-Francis Corpus. As shown in Table 2, the word list for
the 1970's accounted for a greater proportion of words in the Kucera-Francis Corpus
than the Dolch list. These differences were also tested for significance
using one-tailed tests for proportions. The results were significant beyond
the .01 level for both unique words and the cumulative frequency of words.
On the basis of frequency, the word list for the 1970's accounted for significantly
more words in adult materials than the Dolch list.

Discussion, Conclusions, and Challenge to the Future

The results of this study offer evidence that the word list for the 1970's
accounts for a significantly greater proportion of words than the Dolch list in
written materials encountered by both children and adults. If one uses the
criterion of frequency, it is clear that the word list for the 1970's is
statistically superior to the Dolch list. Not surprisingly, the
word list for the 1970's (see Table 3) accounted for a greater percentage of
words than the Dolch list in both the AHI Corpus and the Kucera-Francis Corpus.
It should be pointed out, however, that both lists accounted for over 50 per cent
of the words used in materials for children and adults.
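
The percentages discussed above follow directly from the reported frequencies. The sketch below recomputes them from the component counts given for each corpus (words common to both lists plus words unique to each list) and the corpus sizes stated in the Procedure section. The dictionary and function names are hypothetical, and rounding to two decimal places is an assumption about how the published figures were prepared.

```python
# Corpus sizes in running words, as reported in the text
CORPUS_TOKENS = {"AHI": 5_088_721, "Kucera-Francis": 1_014_232}

# (tokens common to both lists, tokens unique to Dolch, tokens unique to 1970's list)
COUNTS = {
    "AHI": (2_763_051, 40_469, 130_113),
    "Kucera-Francis": (515_035, 3_836, 22_878),
}

def percent_covered(corpus):
    """Return (Dolch per cent, 1970's list per cent) of running words covered."""
    common, dolch_only, new_only = COUNTS[corpus]
    total = CORPUS_TOKENS[corpus]
    dolch = round(100 * (common + dolch_only) / total, 2)
    new = round(100 * (common + new_only) / total, 2)
    return dolch, new

for name in COUNTS:
    print(name, percent_covered(name))
```

Under these assumptions both lists exceed 50 per cent in each corpus, and the two lists differ by roughly two percentage points in each corpus.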

Several conclusions are justified from the results of this study. First,
the word list for the 1970's is, in fact, more useful than the Dolch list if
the criterion of frequency is employed. This conclusion was also supported by
several informal checks of different basal reader series frequently used in
today's schools. Second, the finding that the Dolch list still accounts for
over 55 per cent of the words used in materials written for children in grades
three through nine and for over 50 per cent of the words frequently used in
so-called adult materials offers little evidence to critics who claim that the
Dolch list is passe. Certainly the vast majority of the Dolch words have withstood
the test of time. Finally, although the word list for the 1970's is statistically
superior to the Dolch list, one must wonder about the practical significance
of the difference.

It is altogether possible that there has been too much attention to developing
word lists and not enough attention to how word lists facilitate the effective
teaching of reading. A child will not become an effective reader unless he
develops a large sight vocabulary - and it is obvious that the Dolch list
and other recently published word lists contain many of those words. On the
other hand, it is clear that knowing 220 or some magic number of words is not
a sufficient condition to become an effective reader.

Word count studies can be used to demonstrate that the child who knows
only thirteen words will be equipped to deal with approximately 25 per cent of the
words he meets in print. While this reduces the burden of unknown words for the
child, it is a far cry from making him a proficient reader. Knowing a hundred
words will account for 50 per cent of the running words, but, once again, this
will not make the child an efficient reader. Even knowing 2,500 words still
leaves the child with approximately one unknown word in every four in a natural
reading situation. Clearly, word lists quickly reach a point of diminishing
returns.

These general comments on word lists are intended to stimulate a careful
examination of why word lists are developed. Merely because someone develops
a new word list is not, in itself, a sufficient reason for using it. The same
contention is also directed at "older" word lists - like the Dolch list. Perhaps
it is time to stop developing new word lists and begin to seek answers to some
of the following questions:



1. What role do sight words play in the acquisition of effective and
efficient reading?

2. What methods for teaching sight words are supported by research?

3. What values do word lists have for the teaching of reading?

4. How are high frequency words best learned by individuals attempting
to become efficient readers?

Answers to these and related questions may help recently published word lists
take on a new meaning.



Table 1

Cumulative Frequencies of Words in the AHI Corpus for the
Dolch List and the Word List for the 1970's

                        Words Common    Words Unique              Cumulative
Word List               to Both Lists   to Each List     z        Frequency      z

Dolch List                2,763,051         40,469                2,803,520
  (220 words)

Word List for 1970's      2,763,051        130,113     216.7*     2,893,164    217.1*
  (226 words)

*significant beyond the .01 level



Table 2

Cumulative Frequencies of Words in the Kucera-Francis Corpus for the
Dolch List and the Word List for the 1970's

                        Words Common    Words Unique              Cumulative
Word List               to Both Lists   to Each List     z        Frequency      z

Dolch List                  515,035          3,836                  518,871
  (220 words)

Word List for 1970's        515,035         22,878     119.54*      537,913    116.5*
  (226 words)

*significant beyond the .01 level



Table 3

Approximate Percentage of Words in the AHI Corpus and the
Kucera-Francis Corpus Accounted for by the Dolch List and the
Word List for the 1970's

                                  Per Cent of Words

                            Dolch List    Word List for
                                          the 1970's

AHI Corpus                     55.09          56.85

Kucera-Francis Corpus          51.16          53.04